31 research outputs found

    On the Use of KPCA to Extract Artifacts in One-Dimensional Biomedical Signals

    Kernel principal component analysis (KPCA) is a nonlinear projective technique that can be applied to decompose multi-dimensional signals and extract informative features as well as reduce any noise contributions. In this work we extend KPCA to extract and remove artifact-related contributions as well as noise from one-dimensional signal recordings. We introduce an embedding step which transforms the one-dimensional signal into a multi-dimensional vector. The latter is decomposed in feature space to extract artifact-related contaminations. We further address the pre-image problem and propose an initialization procedure for the fixed-point algorithm which renders it more efficient. Finally, we apply KPCA to extract dominant electrooculogram (EOG) artifacts contaminating electroencephalogram (EEG) recordings in a frontal channel.

    dAMUSE : a new tool for denoising and blind source separation

    In this work a generalized version of AMUSE, called dAMUSE, is proposed. The main modification consists in embedding the observed mixed signals in a high-dimensional feature space of delayed coordinates. With the embedded signals a matrix pencil is formed and its generalized eigendecomposition is computed, as in the AMUSE algorithm. We show that in this case the uncorrelated output signals are filtered versions of the unknown source signals. Further, denoising the data can be achieved conveniently in parallel with the signal separation. Numerical simulations using artificially mixed signals are presented to show the performance of the method. Further, results of a heart rate variability (HRV) study are discussed, showing that the output signals are related to LF (low frequency) and HF (high frequency) fluctuations. Finally, an application to separate artifacts from 2D NOESY NMR spectra and to denoise the reconstructed artifact-free spectra is also presented.

    Denoising using local projective subspace methods

    In this paper we present denoising algorithms for enhancing noisy signals based on Local ICA (LICA), Delayed AMUSE (dAMUSE) and Kernel PCA (KPCA). The LICA algorithm relies on applying ICA locally to clusters of signals embedded in a high-dimensional feature space of delayed coordinates. The components resembling the signals can be detected by various criteria, such as estimators of kurtosis or the variance of autocorrelations, depending on the statistical nature of the signal. The proposed algorithm can be applied favorably to the problem of denoising multi-dimensional data. Another projective subspace denoising method using delayed coordinates has been proposed recently with the dAMUSE algorithm. It combines the solution of blind source separation problems with denoising in an elegant way and proves to be very efficient and fast. Finally, KPCA represents a nonlinear projective subspace method that is also well suited for denoising. Besides illustrative applications to toy examples and images, we provide an application of all algorithms considered to the analysis of protein NMR spectra.

    Exploring matrix factorization techniques for significant genes identification of Alzheimer’s disease microarray gene expression data

    Background: The wide use of high-throughput DNA microarray technology provides an increasingly detailed view of the human transcriptome, from hundreds to thousands of genes. Although biomedical researchers typically design microarray experiments to explore specific biological contexts, the relationships between genes are hard to identify because the data are complex, noisy and high-dimensional, and the analysis is often hindered by low statistical power. The main challenge now is to extract valuable biological information from the colossal amount of data to gain insight into biological processes and the mechanisms of human disease. Overcoming this challenge requires mathematical and computational methods that are versatile enough to capture the underlying biological features and simple enough to be applied efficiently to large datasets. Methods: Unsupervised machine learning approaches provide new and efficient analyses of gene expression profiles. In our study, two unsupervised knowledge-based matrix factorization methods, independent component analysis (ICA) and nonnegative matrix factorization (NMF), are integrated to identify significant genes and related pathways in a microarray gene expression dataset of Alzheimer’s disease (AD). The advantage of these two approaches is that they can be used as biclustering methods, by which genes and conditions are clustered simultaneously. Furthermore, they can group genes into different categories for identifying related diagnostic pathways and regulatory networks. The difference between the two methods is that ICA assumes statistical independence of the expression modes, while NMF needs positivity constraints to generate localized gene expression profiles. Results: In our work, we applied the FastICA and non-smooth NMF methods, respectively, to DNA microarray gene expression data of Alzheimer’s disease. The simulation results show that both methods can clearly separate severe AD samples from control samples, and the biological analysis of the identified significant genes and their related pathways demonstrates that these genes play a prominent role in AD and relates the activation patterns to AD phenotypes. The combination of these two methods is validated to be efficient. Conclusions: Unsupervised matrix factorization methods provide efficient tools to analyze high-throughput microarray datasets. Because different unsupervised approaches explore correlations in the high-dimensional data space and identify relevant subspaces based on different hypotheses, integrating these methods to explore the underlying biological information in microarray datasets is an efficient approach. By combining the significant genes identified by both ICA and NMF, the biological analysis proves highly effective for elucidating the molecular taxonomy of Alzheimer’s disease and enables better experimental design to further identify potential pathways and therapeutic targets of AD.

    Nonlinear projective techniques to extract artifacts in biomedical signals

    Biomedical signals are generally contaminated with artifacts and noise. When the artifacts dominate, the useful signal can easily be extracted with projective subspace techniques. However, biomedical signals, which often represent one-dimensional time series, first need to be transformed into multidimensional signal vectors for these techniques to be applicable. The transformation can be achieved by embedding an observed signal in its delayed coordinates. Using this embedding, we propose to cluster the resulting feature vectors and apply a singular spectrum analysis (SSA) locally in each cluster to recover the undistorted signals. We also compare the reconstructed signals to results obtained with kernel PCA. Both nonlinear subspace projection techniques are applied to artificial data to demonstrate the suppression of random noise signals, as well as to an electroencephalogram (EEG) signal recorded in the frontal channel to extract its prominent electrooculogram (EOG) interference.
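
The embedding in delayed coordinates mentioned in this abstract can be illustrated with a short sketch. It is not the authors' code: it assumes a unit delay and a user-chosen window length `m`, and the function names `delay_embed` and `diagonal_average` are hypothetical. The second function shows one common way (averaging over anti-diagonals, as in SSA) to map a processed trajectory matrix back to a one-dimensional signal.

```python
def delay_embed(signal, m):
    """Build the trajectory matrix of a 1-D signal (assumed unit delay).

    Row i is the signal shifted by i samples, so column k is the
    m-dimensional vector (x[k], x[k+1], ..., x[k+m-1]).
    """
    n_cols = len(signal) - m + 1
    if n_cols < 1:
        raise ValueError("window length m exceeds signal length")
    return [[signal[i + k] for k in range(n_cols)] for i in range(m)]


def diagonal_average(matrix):
    """Map a trajectory matrix back to a 1-D signal.

    Entries with the same i + k refer to the same time sample,
    so they are averaged (anti-diagonal averaging).
    """
    m, n_cols = len(matrix), len(matrix[0])
    sums = [0.0] * (m + n_cols - 1)
    counts = [0] * (m + n_cols - 1)
    for i in range(m):
        for k in range(n_cols):
            sums[i + k] += matrix[i][k]
            counts[i + k] += 1
    return [s / c for s, c in zip(sums, counts)]


# Toy signal: embed with window length 3, then invert the embedding.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
X = delay_embed(x, 3)
x_back = diagonal_average(X)
```

In a denoising pipeline the matrix `X` would be projected onto a subspace (for example by local SSA or KPCA) before `diagonal_average` is applied; with no processing in between, the original signal is recovered unchanged.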

    Analyzing Gene Expression Profiles With ICA

    High-throughput genome-wide measurements of gene transcript levels have become available with the recent development of microarray technology. Intelligent and efficient mathematical and computational analysis tools are needed to read and interpret the information content buried in those large-scale gene expression patterns at various levels of resolution, but the development of such methods is still in its infancy. Modern machine learning and data mining techniques based on information theory, like independent component analysis (ICA), consider gene expression patterns as a superposition of independent expression modes, which are regarded as putative independent biological processes. We focus on two widely used ICA algorithms to blindly decompose gene expression profiles into independent component profiles representing underlying biological processes. These exploratory methods will be capable of detecting similarity, locally or globally, in gene expression patterns and help to group genes into functional categories, for example genes that are expressed to a greater or lesser extent in response to a drug or an existing disease.

    First results on uniqueness of sparse non-negative matrix factorization

    Sparse non-negative matrix factorization (sNMF) allows for the decomposition of a given data set into a mixing matrix and a feature data set, which are both non-negative and fulfill certain sparsity conditions. In this paper it is shown that the projection step proposed by Hoyer has a unique solution, and that it indeed finds this solution. Then indeterminacies of the sNMF model are identified and first uniqueness results are presented, both theoretically and experimentally.
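
The abstract does not state which sparsity condition is used. As a hedged illustration, the sketch below computes the sparseness measure commonly attributed to Hoyer's work on NMF with sparseness constraints, which relates the L1 and L2 norms of a vector; that this is the measure meant here is an assumption, and the function name `hoyer_sparseness` is hypothetical.

```python
import math


def hoyer_sparseness(x):
    """Sparseness in [0, 1] based on the ratio of L1 to L2 norm.

    Returns 1 for a vector with a single non-zero entry and 0 for a
    vector whose entries all have the same magnitude.
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two components")
    l1 = sum(abs(v) for v in x)
    l2 = math.sqrt(sum(v * v for v in x))
    if l2 == 0.0:
        raise ValueError("sparseness is undefined for the zero vector")
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1.0)


sparse_value = hoyer_sparseness([1.0, 0.0, 0.0, 0.0])   # maximally sparse
dense_value = hoyer_sparseness([1.0, 1.0, 1.0, 1.0])    # maximally dense
```

A projection step in this setting would look for the closest non-negative vector with prescribed L1 and L2 norms, that is, with a prescribed value of this measure; the projection itself is not reproduced here.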

    A Generalized Eigendecomposition Approach Using Matrix Pencils to Remove Artifacts from 2D NMR Spectra

    Multidimensional ¹H NMR spectra of biomolecules dissolved in light water are contaminated by an intense water artefact. We discuss the application of the generalized eigenvalue decomposition (GEVD) method using a matrix pencil to explore the time structure of the signals in order to separate out the water artefacts. Simulated as well as experimental 2D NOESY spectra of proteins are studied. Results are compared to those obtained with the FastICA algorithm.

    Delayed AMUSE - A tool for blind source separation and denoising

    In this work we propose a generalized eigendecomposition (GEVD) of a matrix pencil computed after embedding the data into a high-dimensional feature space of delayed coordinates. The matrix pencil is computed as in AMUSE, but in the feature space of delayed coordinates. Its GEVD yields filtered versions of the source signals as output signals. The algorithm is implemented in two EVD steps. Numerical simulations study the influence of the number of delays and the noise level on the performance.

    KPCA denoising and the pre-image problem revisited

    Kernel principal component analysis (KPCA) is widely used in classification, feature extraction and denoising applications. In denoising, it is unavoidable to deal with the pre-image problem, which constitutes the most complex step in the whole processing chain. One of the methods to tackle this problem is an iterative solution based on a fixed-point algorithm. An alternative strategy considers an algebraic approach that relies on the solution of an under-determined system of equations. In this work we present a method that uses this algebraic approach to estimate a good starting point for the fixed-point iteration. We demonstrate that this hybrid solution for the pre-image shows better performance than the other two methods. Further, we extend the applicability of KPCA to one-dimensional signals, which occur in many signal processing applications. We show that artefact removal from such data can be treated on the same footing as denoising. We finally apply the algorithm to denoise the well-known USPS data set and to extract EOG interferences from single-channel EEG recordings.
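
The fixed-point iteration referred to in this abstract can be sketched for the Gaussian kernel as below. This is a minimal illustration of the standard update only, under stated assumptions: the expansion coefficients `gamma` are taken as given (in KPCA denoising they would come from the projection onto the retained kernel principal components), the kernel width `sigma` is user-chosen, and the starting point `z0` is supplied by the caller. The algebraic initialization proposed in the paper is not reproduced, and the function names are hypothetical.

```python
import math


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel exp(-||a - b||^2 / (2 sigma^2))."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def preimage_fixed_point(points, gamma, z0, sigma, n_iter=100, tol=1e-10):
    """Iterate z <- sum_i w_i x_i / sum_i w_i with w_i = gamma_i k(z, x_i)."""
    z = list(z0)
    for _ in range(n_iter):
        weights = [g * gaussian_kernel(z, x, sigma)
                   for g, x in zip(gamma, points)]
        total = sum(weights)
        if abs(total) < 1e-300:
            break  # denominator vanished; a different starting point is needed
        z_new = [sum(w * x[d] for w, x in zip(weights, points)) / total
                 for d in range(len(z))]
        shift = math.sqrt(sum((a - b) ** 2 for a, b in zip(z_new, z)))
        z = z_new
        if shift < tol:
            break
    return z


# Toy data: two nearby points and one far away, with uniform
# (hypothetical) coefficients; the iterate settles near the close pair.
points = [[0.0, 0.0], [0.2, 0.0], [3.0, 3.0]]
gamma = [1.0 / 3.0] * 3
z = preimage_fixed_point(points, gamma, z0=[0.5, 0.5], sigma=0.5)
```

The guard on the denominator reflects a known weakness of this iteration: it depends on the starting point and can stall or end in a poor local solution, which is why a good initialization matters.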